Overview

Dataset statistics

Number of variables: 27
Number of observations: 50000
Missing cells: 30053
Missing cells (%): 2.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 10.3 MiB
Average record size in memory: 216.0 B
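
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, using only the numbers reported in this table (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 27
n_observations = 50_000
missing_cells = 30_053
total_size_mib = 10.3

# Share of missing cells over the full variables x observations grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory (MiB converted to bytes) per row.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"{missing_pct:.1f}%")        # 2.2%
print(f"{avg_record_bytes:.1f} B")  # 216.0 B
```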

Variable types

Categorical: 19
Numeric: 8

Alerts

ID has a high cardinality: 50000 distinct values High cardinality
Customer_ID has a high cardinality: 12500 distinct values High cardinality
Name has a high cardinality: 10139 distinct values High cardinality
Age has a high cardinality: 976 distinct values High cardinality
SSN has a high cardinality: 12501 distinct values High cardinality
Annual_Income has a high cardinality: 16121 distinct values High cardinality
Num_of_Loan has a high cardinality: 263 distinct values High cardinality
Type_of_Loan has a high cardinality: 6260 distinct values High cardinality
Num_of_Delayed_Payment has a high cardinality: 443 distinct values High cardinality
Changed_Credit_Limit has a high cardinality: 3927 distinct values High cardinality
Outstanding_Debt has a high cardinality: 12685 distinct values High cardinality
Credit_History_Age has a high cardinality: 399 distinct values High cardinality
Amount_invested_monthly has a high cardinality: 45450 distinct values High cardinality
Monthly_Balance has a high cardinality: 49433 distinct values High cardinality
Num_Bank_Accounts is highly correlated with Interest_Rate and 1 other field High correlation
Interest_Rate is highly correlated with Num_Bank_Accounts and 2 other fields High correlation
Delay_from_due_date is highly correlated with Credit_Mix and 1 other field High correlation
Num_Credit_Inquiries is highly correlated with Interest_Rate High correlation
Credit_Mix is highly correlated with Delay_from_due_date High correlation
Payment_of_Min_Amount is highly correlated with Delay_from_due_date High correlation
Name has 5015 (10.0%) missing values Missing
Monthly_Inhand_Salary has 7498 (15.0%) missing values Missing
Type_of_Loan has 5704 (11.4%) missing values Missing
Num_of_Delayed_Payment has 3498 (7.0%) missing values Missing
Num_Credit_Inquiries has 1035 (2.1%) missing values Missing
Credit_History_Age has 4470 (8.9%) missing values Missing
Amount_invested_monthly has 2271 (4.5%) missing values Missing
Monthly_Balance has 562 (1.1%) missing values Missing
ID is uniformly distributed Uniform
Customer_ID is uniformly distributed Uniform
Month is uniformly distributed Uniform
Annual_Income is uniformly distributed Uniform
Outstanding_Debt is uniformly distributed Uniform
Monthly_Balance is uniformly distributed Uniform
ID has unique values Unique
Credit_Utilization_Ratio has unique values Unique
Num_Bank_Accounts has 2166 (4.3%) zeros Zeros
Delay_from_due_date has 626 (1.3%) zeros Zeros
Num_Credit_Inquiries has 1102 (2.2%) zeros Zeros
Total_EMI_per_month has 5002 (10.0%) zeros Zeros

Reproduction

Analysis started: 2022-11-29 10:15:14.164657
Analysis finished: 2022-11-29 10:15:39.210009
Duration: 25.05 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json
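
The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2022-11-29 10:15:14.164657")
finished = datetime.fromisoformat("2022-11-29 10:15:39.210009")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 25.05 seconds
```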

Variables

ID
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 50000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
0x289d: 1
0x20487: 1
0x23501: 1
0x168d1: 1
0x20186: 1
Other values (49995): 49995

Length

Max length: 7
Median length: 7
Mean length: 6.60068
Min length: 6

Characters and Unicode

Total characters: 330034
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
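
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. This is only a sketch of the idea, not necessarily how the report computes its counts; `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter


def category_counts(value: str) -> Counter:
    """Count Unicode general categories (e.g. 'Nd', 'Ll') in a string."""
    return Counter(unicodedata.category(ch) for ch in value)


# "0x160a" is one of the ID values listed in this section.
counts = category_counts("0x160a")
print(counts)  # Counter({'Nd': 4, 'Ll': 2})
```

`Nd` is "Decimal Number" and `Ll` is "Lowercase Letter", the two categories the report lists for ID.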

Unique

Unique: 50000
Unique (%): 100.0%

Sample

1st row: 0x160a
2nd row: 0x160b
3rd row: 0x160c
4th row: 0x160d
5th row: 0x1616
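
The sampled IDs look like hexadecimal literals with a `0x` prefix. Assuming every ID follows that pattern (the report does not state this), they could be converted to integers for sorting or range checks along these lines:

```python
# Sample rows copied from the report; the hex interpretation is an assumption.
sample_ids = ["0x160a", "0x160b", "0x160c", "0x160d", "0x1616"]

# int() accepts the "0x" prefix when base 16 is given explicitly.
as_integers = [int(value, 16) for value in sample_ids]
print(as_integers)  # [5642, 5643, 5644, 5645, 5654]
```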

Common Values

Value | Count | Frequency (%)
0x289d | 1 | < 0.1%
0x20487 | 1 | < 0.1%
0x23501 | 1 | < 0.1%
0x168d1 | 1 | < 0.1%
0x20186 | 1 | < 0.1%
0x1cb58 | 1 | < 0.1%
0x89a2 | 1 | < 0.1%
0x161bb | 1 | < 0.1%
0x17b52 | 1 | < 0.1%
0x1d22f | 1 | < 0.1%
Other values (49990) | 49990 | > 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0x289d1
 
< 0.1%
0x24c31
 
< 0.1%
0xe8191
 
< 0.1%
0x172cc1
 
< 0.1%
0x24bed1
 
< 0.1%
0x5e0a1
 
< 0.1%
0x13da21
 
< 0.1%
0x14d711
 
< 0.1%
0xecbb1
 
< 0.1%
0x6d311
 
< 0.1%
Other values (49990)49990
> 99.9%

Most occurring characters

ValueCountFrequency (%)
062051
18.8%
x50000
15.1%
134749
10.5%
221607
 
6.5%
313419
 
4.1%
413417
 
4.1%
513415
 
4.1%
912141
 
3.7%
c12141
 
3.7%
812139
 
3.7%
Other values (7)84955
25.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number207212
62.8%
Lowercase Letter122822
37.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
062051
29.9%
134749
16.8%
221607
 
10.4%
313419
 
6.5%
413417
 
6.5%
513415
 
6.5%
912141
 
5.9%
812139
 
5.9%
612139
 
5.9%
712135
 
5.9%
Lowercase Letter
ValueCountFrequency (%)
x50000
40.7%
c12141
 
9.9%
b12139
 
9.9%
e12139
 
9.9%
d12135
 
9.9%
a12135
 
9.9%
f12133
 
9.9%

Most occurring scripts

ValueCountFrequency (%)
Common207212
62.8%
Latin122822
37.2%

Most frequent character per script

Common
ValueCountFrequency (%)
062051
29.9%
134749
16.8%
221607
 
10.4%
313419
 
6.5%
413417
 
6.5%
513415
 
6.5%
912141
 
5.9%
812139
 
5.9%
612139
 
5.9%
712135
 
5.9%
Latin
ValueCountFrequency (%)
x50000
40.7%
c12141
 
9.9%
b12139
 
9.9%
e12139
 
9.9%
d12135
 
9.9%
a12135
 
9.9%
f12133
 
9.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII330034
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
062051
18.8%
x50000
15.1%
134749
10.5%
221607
 
6.5%
313419
 
4.1%
413417
 
4.1%
513415
 
4.1%
912141
 
3.7%
c12141
 
3.7%
812139
 
3.7%
Other values (7)84955
25.7%

Customer_ID
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 12500
Distinct (%): 25.0%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
CUS_0x2425: 4
CUS_0x12cb: 4
CUS_0x6df3: 4
CUS_0xb56b: 4
CUS_0x3032: 4
Other values (12495): 49980

Length

Max length: 10
Median length: 10
Mean length: 9.93952
Min length: 9

Characters and Unicode

Total characters: 496976
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CUS_0xd40
2nd row: CUS_0xd40
3rd row: CUS_0xd40
4th row: CUS_0xd40
5th row: CUS_0x21b1

Common Values

Value | Count | Frequency (%)
CUS_0x2425 | 4 | < 0.1%
CUS_0x12cb | 4 | < 0.1%
CUS_0x6df3 | 4 | < 0.1%
CUS_0xb56b | 4 | < 0.1%
CUS_0x3032 | 4 | < 0.1%
CUS_0x3870 | 4 | < 0.1%
CUS_0x1bdf | 4 | < 0.1%
CUS_0x1fdc | 4 | < 0.1%
CUS_0x24cf | 4 | < 0.1%
CUS_0x9192 | 4 | < 0.1%
Other values (12490) | 49960 | 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
cus_0x24254
 
< 0.1%
cus_0x24994
 
< 0.1%
cus_0x4d034
 
< 0.1%
cus_0x3c6a4
 
< 0.1%
cus_0x143c4
 
< 0.1%
cus_0x2a694
 
< 0.1%
cus_0x47974
 
< 0.1%
cus_0x3d504
 
< 0.1%
cus_0x3fb4
 
< 0.1%
cus_0x997e4
 
< 0.1%
Other values (12490)49960
99.9%

Most occurring characters

ValueCountFrequency (%)
059124
11.9%
C50000
 
10.1%
S50000
 
10.1%
_50000
 
10.1%
x50000
 
10.1%
U50000
 
10.1%
414000
 
2.8%
613700
 
2.8%
513600
 
2.7%
313536
 
2.7%
Other values (11)133016
26.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number180604
36.3%
Uppercase Letter150000
30.2%
Lowercase Letter116372
23.4%
Connector Punctuation50000
 
10.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
059124
32.7%
414000
 
7.8%
613700
 
7.6%
513600
 
7.5%
313536
 
7.5%
813536
 
7.5%
713388
 
7.4%
913368
 
7.4%
213360
 
7.4%
112992
 
7.2%
Lowercase Letter
ValueCountFrequency (%)
x50000
43.0%
b13400
 
11.5%
a13272
 
11.4%
c11136
 
9.6%
e9744
 
8.4%
d9436
 
8.1%
f9384
 
8.1%
Uppercase Letter
ValueCountFrequency (%)
C50000
33.3%
S50000
33.3%
U50000
33.3%
Connector Punctuation
ValueCountFrequency (%)
_50000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin266372
53.6%
Common230604
46.4%

Most frequent character per script

Common
ValueCountFrequency (%)
059124
25.6%
_50000
21.7%
414000
 
6.1%
613700
 
5.9%
513600
 
5.9%
313536
 
5.9%
813536
 
5.9%
713388
 
5.8%
913368
 
5.8%
213360
 
5.8%
Latin
ValueCountFrequency (%)
C50000
18.8%
S50000
18.8%
x50000
18.8%
U50000
18.8%
b13400
 
5.0%
a13272
 
5.0%
c11136
 
4.2%
e9744
 
3.7%
d9436
 
3.5%
f9384
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII496976
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
059124
11.9%
C50000
 
10.1%
S50000
 
10.1%
_50000
 
10.1%
x50000
 
10.1%
U50000
 
10.1%
414000
 
2.8%
613700
 
2.8%
513600
 
2.7%
313536
 
2.7%
Other values (11)133016
26.8%

Month
Categorical

UNIFORM

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
September: 12500
October: 12500
December: 12500
November: 12500

Length

Max length: 9
Median length: 8.5
Mean length: 8
Min length: 7

Characters and Unicode

Total characters: 400000
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: September
2nd row: October
3rd row: November
4th row: December
5th row: September

Common Values

Value | Count | Frequency (%)
September | 12500 | 25.0%
October | 12500 | 25.0%
December | 12500 | 25.0%
November | 12500 | 25.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
september12500
25.0%
october12500
25.0%
december12500
25.0%
november12500
25.0%

Most occurring characters

ValueCountFrequency (%)
e112500
28.1%
b50000
12.5%
r50000
12.5%
m37500
 
9.4%
t25000
 
6.2%
c25000
 
6.2%
o25000
 
6.2%
S12500
 
3.1%
p12500
 
3.1%
O12500
 
3.1%
Other values (3)37500
 
9.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter350000
87.5%
Uppercase Letter50000
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e112500
32.1%
b50000
14.3%
r50000
14.3%
m37500
 
10.7%
t25000
 
7.1%
c25000
 
7.1%
o25000
 
7.1%
p12500
 
3.6%
v12500
 
3.6%
Uppercase Letter
ValueCountFrequency (%)
S12500
25.0%
O12500
25.0%
D12500
25.0%
N12500
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin400000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e112500
28.1%
b50000
12.5%
r50000
12.5%
m37500
 
9.4%
t25000
 
6.2%
c25000
 
6.2%
o25000
 
6.2%
S12500
 
3.1%
p12500
 
3.1%
O12500
 
3.1%
Other values (3)37500
 
9.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII400000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e112500
28.1%
b50000
12.5%
r50000
12.5%
m37500
 
9.4%
t25000
 
6.2%
c25000
 
6.2%
o25000
 
6.2%
S12500
 
3.1%
p12500
 
3.1%
O12500
 
3.1%
Other values (3)37500
 
9.4%

Name
Categorical

HIGH CARDINALITY
MISSING

Distinct: 10139
Distinct (%): 22.5%
Missing: 5015
Missing (%): 10.0%
Memory size: 390.8 KiB
Stevex: 22
Langep: 21
Deepa Seetharamanm: 20
Ronald Groverk: 20
Jessicad: 20
Other values (10134): 44882

Length

Max length: 25
Median length: 20
Mean length: 9.758274981
Min length: 2

Characters and Unicode

Total characters: 438976
Distinct characters: 57
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27
Unique (%): 0.1%

Sample

1st row: Aaron Maashoh
2nd row: Aaron Maashoh
3rd row: Aaron Maashoh
4th row: Aaron Maashoh
5th row: Rick Rothackerj

Common Values

Value | Count | Frequency (%)
Stevex | 22 | < 0.1%
Langep | 21 | < 0.1%
Deepa Seetharamanm | 20 | < 0.1%
Ronald Groverk | 20 | < 0.1%
Jessicad | 20 | < 0.1%
Raymondr | 20 | < 0.1%
Nicko | 20 | < 0.1%
Vaughanl | 19 | < 0.1%
Jonesb | 19 | < 0.1%
Jessica Wohlt | 19 | < 0.1%
Other values (10129) | 44785 | 89.6%
(Missing) | 5015 | 10.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
david328
 
0.5%
jonathan300
 
0.5%
jessica256
 
0.4%
sarah212
 
0.3%
karen190
 
0.3%
nick184
 
0.3%
tim184
 
0.3%
caroline181
 
0.3%
tom174
 
0.3%
john169
 
0.3%
Other values (9720)60616
96.5%

Most occurring characters

ValueCountFrequency (%)
a45799
 
10.4%
e38101
 
8.7%
n29539
 
6.7%
i29158
 
6.6%
r27144
 
6.2%
o22261
 
5.1%
l21086
 
4.8%
17825
 
4.1%
t17446
 
4.0%
h15228
 
3.5%
Other values (47)175389
40.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter356743
81.3%
Uppercase Letter62618
 
14.3%
Space Separator17825
 
4.1%
Other Punctuation1075
 
0.2%
Dash Punctuation715
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a45799
12.8%
e38101
 
10.7%
n29539
 
8.3%
i29158
 
8.2%
r27144
 
7.6%
o22261
 
6.2%
l21086
 
5.9%
t17446
 
4.9%
h15228
 
4.3%
s15226
 
4.3%
Other values (16)95755
26.8%
Uppercase Letter
ValueCountFrequency (%)
S7100
 
11.3%
A4370
 
7.0%
M4314
 
6.9%
L4206
 
6.7%
J4014
 
6.4%
C3883
 
6.2%
R3587
 
5.7%
D3511
 
5.6%
K3447
 
5.5%
B3251
 
5.2%
Other values (16)20935
33.4%
Other Punctuation
ValueCountFrequency (%)
.551
51.3%
"478
44.5%
,46
 
4.3%
Space Separator
ValueCountFrequency (%)
17825
100.0%
Dash Punctuation
ValueCountFrequency (%)
-715
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin419361
95.5%
Common19615
 
4.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a45799
 
10.9%
e38101
 
9.1%
n29539
 
7.0%
i29158
 
7.0%
r27144
 
6.5%
o22261
 
5.3%
l21086
 
5.0%
t17446
 
4.2%
h15228
 
3.6%
s15226
 
3.6%
Other values (42)158373
37.8%
Common
ValueCountFrequency (%)
17825
90.9%
-715
 
3.6%
.551
 
2.8%
"478
 
2.4%
,46
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII438976
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a45799
 
10.4%
e38101
 
8.7%
n29539
 
6.7%
i29158
 
6.6%
r27144
 
6.2%
o22261
 
5.1%
l21086
 
4.8%
17825
 
4.1%
t17446
 
4.0%
h15228
 
3.5%
Other values (47)175389
40.0%

Age
Categorical

HIGH CARDINALITY

Distinct: 976
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
39: 1493
32: 1440
44: 1428
22: 1422
35: 1414
Other values (971): 42803

Length

Max length: 5
Median length: 2
Mean length: 2.10342
Min length: 2

Characters and Unicode

Total characters: 105171
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 839
Unique (%): 1.7%

Sample

1st row: 23
2nd row: 24
3rd row: 24
4th row: 24_
5th row: 28
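
The sample includes `24_`, which suggests that stray underscores are one reason Age is typed as Categorical rather than Numeric. A hedged sketch of one possible cleaning step follows; `parse_age` is a hypothetical helper, and stripping underscores is an assumption about what the suffix means, not something the report confirms.

```python
from typing import Optional


def parse_age(raw: str) -> Optional[int]:
    """Strip surrounding whitespace and underscores, then convert to int.

    Returns None when the remaining text is not an integer.
    """
    cleaned = raw.strip().strip("_")
    try:
        return int(cleaned)
    except ValueError:
        return None


# Sample rows copied from the report.
sample = ["23", "24", "24", "24_", "28"]
parsed = [parse_age(value) for value in sample]
print(parsed)  # [23, 24, 24, 24, 28]
```

Negative or implausibly large results would still need a separate range check, since this helper only handles formatting.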

Common Values

Value | Count | Frequency (%)
39 | 1493 | 3.0%
32 | 1440 | 2.9%
44 | 1428 | 2.9%
22 | 1422 | 2.8%
35 | 1414 | 2.8%
37 | 1397 | 2.8%
27 | 1382 | 2.8%
20 | 1374 | 2.7%
29 | 1368 | 2.7%
26 | 1348 | 2.7%
Other values (966) | 35934 | 71.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
391570
 
3.1%
321529
 
3.1%
441500
 
3.0%
221493
 
3.0%
351483
 
3.0%
371461
 
2.9%
271457
 
2.9%
291441
 
2.9%
201432
 
2.9%
261421
 
2.8%
Other values (918)35213
70.4%

Most occurring characters

ValueCountFrequency (%)
219446
18.5%
319144
18.2%
416616
15.8%
510984
10.4%
19785
9.3%
05994
 
5.7%
65690
 
5.4%
95305
 
5.0%
74704
 
4.5%
84562
 
4.3%
Other values (2)2941
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number102230
97.2%
Connector Punctuation2477
 
2.4%
Dash Punctuation464
 
0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
219446
19.0%
319144
18.7%
416616
16.3%
510984
10.7%
19785
9.6%
05994
 
5.9%
65690
 
5.6%
95305
 
5.2%
74704
 
4.6%
84562
 
4.5%
Connector Punctuation
ValueCountFrequency (%)
_2477
100.0%
Dash Punctuation
ValueCountFrequency (%)
-464
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common105171
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
219446
18.5%
319144
18.2%
416616
15.8%
510984
10.4%
19785
9.3%
05994
 
5.7%
65690
 
5.4%
95305
 
5.0%
74704
 
4.5%
84562
 
4.3%
Other values (2)2941
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII105171
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
219446
18.5%
319144
18.2%
416616
15.8%
510984
10.4%
19785
9.3%
05994
 
5.7%
65690
 
5.4%
95305
 
5.0%
74704
 
4.5%
84562
 
4.3%
Other values (2)2941
 
2.8%

SSN
Categorical

HIGH CARDINALITY

Distinct: 12501
Distinct (%): 25.0%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
#F%$D@*&8: 2828
605-63-9678: 4
500-54-3583: 4
027-69-5774: 4
716-63-9191: 4
Other values (12496): 47156

Length

Max length: 11
Median length: 11
Mean length: 10.88688
Min length: 9

Characters and Unicode

Total characters: 544344
Distinct characters: 19
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: 821-00-0265
2nd row: 821-00-0265
3rd row: 821-00-0265
4th row: 821-00-0265
5th row: 004-07-5839

Common Values

Value | Count | Frequency (%)
#F%$D@*&8 | 2828 | 5.7%
605-63-9678 | 4 | < 0.1%
500-54-3583 | 4 | < 0.1%
027-69-5774 | 4 | < 0.1%
716-63-9191 | 4 | < 0.1%
423-96-9115 | 4 | < 0.1%
885-50-0108 | 4 | < 0.1%
268-75-5454 | 4 | < 0.1%
226-58-1620 | 4 | < 0.1%
973-70-9064 | 4 | < 0.1%
Other values (12491) | 47136 | 94.3%
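
The most common SSN value, `#F%$D@*&8`, does not match the three-two-four digit layout of the other listed values and appears 2828 times. One way to handle it, sketched below, is to treat anything that fails the pattern as missing. The regular expression and the `clean_ssn` helper are assumptions for illustration; the report itself reports 0 missing values for this column.

```python
import re
from typing import Optional

# Assumed layout, inferred from the listed values: NNN-NN-NNNN.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def clean_ssn(raw: str) -> Optional[str]:
    """Return the value unchanged if it matches the layout, else None."""
    return raw if SSN_PATTERN.fullmatch(raw) else None


values = ["821-00-0265", "#F%$D@*&8", "605-63-9678"]
cleaned = [clean_ssn(value) for value in values]
print(cleaned)  # ['821-00-0265', None, '605-63-9678']
```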

Length

Histogram of lengths of the category
ValueCountFrequency (%)
f%$d@*&82828
 
5.7%
999-28-94834
 
< 0.1%
249-63-85544
 
< 0.1%
103-20-76164
 
< 0.1%
956-23-38604
 
< 0.1%
350-85-76034
 
< 0.1%
989-07-37704
 
< 0.1%
300-57-87864
 
< 0.1%
605-53-69074
 
< 0.1%
187-58-88434
 
< 0.1%
Other values (12491)47136
94.3%

Most occurring characters

ValueCountFrequency (%)
-94344
17.3%
845742
8.4%
143235
7.9%
442874
7.9%
242687
7.8%
742591
7.8%
942261
7.8%
042219
7.8%
542183
7.7%
341862
7.7%
Other values (9)64346
11.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number427376
78.5%
Dash Punctuation94344
 
17.3%
Other Punctuation14140
 
2.6%
Uppercase Letter5656
 
1.0%
Currency Symbol2828
 
0.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
845742
10.7%
143235
10.1%
442874
10.0%
242687
10.0%
742591
10.0%
942261
9.9%
042219
9.9%
542183
9.9%
341862
9.8%
641722
9.8%
Other Punctuation
ValueCountFrequency (%)
&2828
20.0%
*2828
20.0%
@2828
20.0%
%2828
20.0%
#2828
20.0%
Uppercase Letter
ValueCountFrequency (%)
F2828
50.0%
D2828
50.0%
Dash Punctuation
ValueCountFrequency (%)
-94344
100.0%
Currency Symbol
ValueCountFrequency (%)
$2828
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common538688
99.0%
Latin5656
 
1.0%

Most frequent character per script

Common
ValueCountFrequency (%)
-94344
17.5%
845742
8.5%
143235
8.0%
442874
8.0%
242687
7.9%
742591
7.9%
942261
7.8%
042219
7.8%
542183
7.8%
341862
7.8%
Other values (7)58690
10.9%
Latin
ValueCountFrequency (%)
F2828
50.0%
D2828
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII544344
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
-94344
17.3%
845742
8.4%
143235
7.9%
442874
7.9%
242687
7.8%
742591
7.8%
942261
7.8%
042219
7.8%
542183
7.7%
341862
7.7%
Other values (9)64346
11.8%

Occupation
Categorical

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
_______: 3438
Lawyer: 3324
Engineer: 3212
Architect: 3195
Mechanic: 3168
Other values (11): 33663

Length

Max length: 13
Median length: 10
Mean length: 8.43476
Min length: 6

Characters and Unicode

Total characters: 421738
Distinct characters: 28
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Scientist
2nd row: Scientist
3rd row: Scientist
4th row: Scientist
5th row: _______

Common Values

Value | Count | Frequency (%)
_______ | 3438 | 6.9%
Lawyer | 3324 | 6.6%
Engineer | 3212 | 6.4%
Architect | 3195 | 6.4%
Mechanic | 3168 | 6.3%
Developer | 3146 | 6.3%
Accountant | 3133 | 6.3%
Media_Manager | 3130 | 6.3%
Scientist | 3104 | 6.2%
Teacher | 3103 | 6.2%
Other values (6) | 18047 | 36.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
3438
 
6.9%
lawyer3324
 
6.6%
engineer3212
 
6.4%
architect3195
 
6.4%
mechanic3168
 
6.3%
developer3146
 
6.3%
accountant3133
 
6.3%
media_manager3130
 
6.3%
scientist3104
 
6.2%
teacher3103
 
6.2%
Other values (6)18047
36.1%

Most occurring characters

ValueCountFrequency (%)
e56361
13.4%
r43349
10.3%
n37282
 
8.8%
a34102
 
8.1%
c31173
 
7.4%
t30964
 
7.3%
i30777
 
7.3%
_27196
 
6.4%
M15375
 
3.6%
o15370
 
3.6%
Other values (18)99789
23.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter344850
81.8%
Uppercase Letter49692
 
11.8%
Connector Punctuation27196
 
6.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e56361
16.3%
r43349
12.6%
n37282
10.8%
a34102
9.9%
c31173
9.0%
t30964
9.0%
i30777
8.9%
o15370
 
4.5%
u12220
 
3.5%
h9466
 
2.7%
Other values (8)43786
12.7%
Uppercase Letter
ValueCountFrequency (%)
M15375
30.9%
A6328
12.7%
E6315
12.7%
D6173
12.4%
L3324
 
6.7%
S3104
 
6.2%
T3103
 
6.2%
J3037
 
6.1%
W2933
 
5.9%
Connector Punctuation
ValueCountFrequency (%)
_27196
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin394542
93.6%
Common27196
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e56361
14.3%
r43349
11.0%
n37282
9.4%
a34102
 
8.6%
c31173
 
7.9%
t30964
 
7.8%
i30777
 
7.8%
M15375
 
3.9%
o15370
 
3.9%
u12220
 
3.1%
Other values (17)87569
22.2%
Common
ValueCountFrequency (%)
_27196
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII421738
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e56361
13.4%
r43349
10.3%
n37282
 
8.8%
a34102
 
8.1%
c31173
 
7.4%
t30964
 
7.3%
i30777
 
7.3%
_27196
 
6.4%
M15375
 
3.6%
o15370
 
3.6%
Other values (18)99789
23.7%

Annual_Income
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 16121
Distinct (%): 32.2%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
9141.63: 8
95596.35: 8
72524.2: 8
36585.12: 8
22434.16: 8
Other values (16116): 49960

Length

Max length: 19
Median length: 8
Mean length: 8.3094
Min length: 6

Characters and Unicode

Total characters: 415470
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3335
Unique (%): 6.7%

Sample

1st row: 19114.12
2nd row: 19114.12
3rd row: 19114.12
4th row: 19114.12
5th row: 34847.84

Common Values

Value | Count | Frequency (%)
9141.63 | 8 | < 0.1%
95596.35 | 8 | < 0.1%
72524.2 | 8 | < 0.1%
36585.12 | 8 | < 0.1%
22434.16 | 8 | < 0.1%
17816.75 | 8 | < 0.1%
109945.32 | 8 | < 0.1%
20867.67 | 7 | < 0.1%
40341.16 | 7 | < 0.1%
33029.66 | 7 | < 0.1%
Other values (16111) | 49923 | 99.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
9141.638
 
< 0.1%
20867.678
 
< 0.1%
95596.358
 
< 0.1%
32543.388
 
< 0.1%
33029.668
 
< 0.1%
40341.168
 
< 0.1%
17273.838
 
< 0.1%
109945.328
 
< 0.1%
17816.758
 
< 0.1%
22434.168
 
< 0.1%
Other values (12979)49920
99.8%

Most occurring characters

ValueCountFrequency (%)
.50000
12.0%
146376
11.2%
238186
9.2%
435878
8.6%
335800
8.6%
835462
8.5%
535357
8.5%
635326
8.5%
934524
8.3%
033173
8.0%
Other values (2)35388
8.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number361950
87.1%
Other Punctuation50000
 
12.0%
Connector Punctuation3520
 
0.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
146376
12.8%
238186
10.6%
435878
9.9%
335800
9.9%
835462
9.8%
535357
9.8%
635326
9.8%
934524
9.5%
033173
9.2%
731868
8.8%
Other Punctuation
ValueCountFrequency (%)
.50000
100.0%
Connector Punctuation
ValueCountFrequency (%)
_3520
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common415470
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.50000
12.0%
146376
11.2%
238186
9.2%
435878
8.6%
335800
8.6%
835462
8.5%
535357
8.5%
635326
8.5%
934524
8.3%
033173
8.0%
Other values (2)35388
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII415470
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.50000
12.0%
146376
11.2%
238186
9.2%
435878
8.6%
335800
8.6%
835462
8.5%
535357
8.5%
635326
8.5%
934524
8.3%
033173
8.0%
Other values (2)35388
8.5%

Monthly_Inhand_Salary
Real number (ℝ≥0)

MISSING

Distinct: 12793
Distinct (%): 30.1%
Missing: 7498
Missing (%): 15.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4182.004291
Minimum: 303.6454167
Maximum: 15204.63333
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 303.6454167
5-th percentile: 833.7731458
Q1: 1625.188333
median: 3086.305
Q3: 5934.189094
95-th percentile: 10771.23333
Maximum: 15204.63333
Range: 14900.98792
Interquartile range (IQR): 4309.00076

Descriptive statistics

Standard deviation: 3174.109304
Coefficient of variation (CV): 0.7589923594
Kurtosis: 0.6280610902
Mean: 4182.004291
Median Absolute Deviation (MAD): 1750.829583
Skewness: 1.131373974
Sum: 177743546.4
Variance: 10074969.87
Monotonicity: Not monotonic
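
Several of the statistics above are derived from others in the same block. The sketch below recomputes them from the reported figures as a consistency check; it uses the standard definitions (CV = standard deviation / mean, range = maximum - minimum, IQR = Q3 - Q1, variance = standard deviation squared), and small differences come from rounding in the report.

```python
# Figures copied from the Monthly_Inhand_Salary statistics.
mean = 4182.004291
std = 3174.109304
minimum, maximum = 303.6454167, 15204.63333
q1, q3 = 1625.188333, 5934.189094

cv = std / mean                  # coefficient of variation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
variance = std ** 2              # variance

print(f"CV={cv:.4f} range={value_range:.2f} IQR={iqr:.2f} variance={variance:.2f}")
```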
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1315.5608338
 
< 0.1%
4387.27257
 
< 0.1%
3080.5557
 
< 0.1%
5766.4916677
 
< 0.1%
6082.18757
 
< 0.1%
2295.0583337
 
< 0.1%
6639.567
 
< 0.1%
536.431257
 
< 0.1%
6358.9566676
 
< 0.1%
10511.334
 
< 0.1%
Other values (12783)42435
84.9%
(Missing)7498
 
15.0%
ValueCountFrequency (%)
303.64541672
< 0.1%
319.556254
< 0.1%
331.03192332
< 0.1%
332.12833333
< 0.1%
332.431254
< 0.1%
333.59666674
< 0.1%
355.20833334
< 0.1%
357.25583334
< 0.1%
358.05833334
< 0.1%
361.60333334
< 0.1%
ValueCountFrequency (%)
15204.633333
< 0.1%
15167.184
< 0.1%
15136.696673
< 0.1%
15115.193
< 0.1%
15101.943
< 0.1%
15090.076674
< 0.1%
15066.783334
< 0.1%
14978.336673
< 0.1%
14960.251
 
< 0.1%
14929.543
< 0.1%

Num_Bank_Accounts
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 540
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.83826
Minimum: -1
Maximum: 1798
Zeros: 2166
Zeros (%): 4.3%
Negative: 16
Negative (%): < 0.1%
Memory size: 390.8 KiB

Quantile statistics

Minimum: -1
5-th percentile: 1
Q1: 3
median: 6
Q3: 7
95-th percentile: 10
Maximum: 1798
Range: 1799
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 116.3968482
Coefficient of variation (CV): 6.912641103
Kurtosis: 132.919184
Mean: 16.83826
Median Absolute Deviation (MAD): 2
Skewness: 11.25168183
Sum: 841913
Variance: 13548.22626
Monotonicity: Not monotonic
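
With a 95th percentile of 10 and a maximum of 1798, the upper tail of Num_Bank_Accounts is far from the bulk of the data. The report does not flag outliers itself; as one illustrative option, the sketch below applies Tukey's 1.5 x IQR rule to the reported quartiles. The rule and the `is_outlier` helper are assumptions added here, not part of the profiling output.

```python
# Quartiles copied from the Num_Bank_Accounts quantile statistics.
q1, q3 = 3, 7
iqr = q3 - q1

# Tukey fences: values outside [lower_fence, upper_fence] are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr


def is_outlier(x: float) -> bool:
    """True when x lies outside the Tukey fences."""
    return x < lower_fence or x > upper_fence


# A few values that appear in the report's extreme-value tables.
candidates = [-1, 0, 6, 10, 1798]
flagged = [x for x in candidates if is_outlier(x)]
print(lower_fence, upper_fence, flagged)  # -3.0 13.0 [1798]
```

Note that the minimum of -1 falls inside these fences, so a negative account count would need a separate domain check.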
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
66504
13.0%
76408
12.8%
86387
12.8%
46100
12.2%
56068
12.1%
35955
11.9%
92738
5.5%
102599
 
5.2%
12253
 
4.5%
02166
 
4.3%
Other values (530)2822
5.6%
ValueCountFrequency (%)
-116
 
< 0.1%
02166
 
4.3%
12253
 
4.5%
22152
 
4.3%
35955
11.9%
46100
12.2%
56068
12.1%
66504
13.0%
76408
12.8%
86387
12.8%
ValueCountFrequency (%)
17981
< 0.1%
17831
< 0.1%
17811
< 0.1%
17801
< 0.1%
17751
< 0.1%
17742
< 0.1%
17731
< 0.1%
17721
< 0.1%
17711
< 0.1%
17701
< 0.1%

Num_Credit_Card
Real number (ℝ≥0)

Distinct: 819
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.92148
Minimum: 0
Maximum: 1499
Zeros: 16
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 4
median: 5
Q3: 7
95-th percentile: 10
Maximum: 1499
Range: 1499
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 129.3148043
Coefficient of variation (CV): 5.641642872
Kurtosis: 71.87065897
Mean: 22.92148
Median Absolute Deviation (MAD): 2
Skewness: 8.286879673
Sum: 1146074
Variance: 16722.3186
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
59210
18.4%
78271
16.5%
68243
16.5%
47072
14.1%
36539
13.1%
82497
 
5.0%
102405
 
4.8%
92333
 
4.7%
21131
 
2.3%
11063
 
2.1%
Other values (809)1236
 
2.5%
ValueCountFrequency (%)
016
 
< 0.1%
11063
 
2.1%
21131
 
2.3%
36539
13.1%
47072
14.1%
59210
18.4%
68243
16.5%
78271
16.5%
82497
 
5.0%
92333
 
4.7%
ValueCountFrequency (%)
14991
 
< 0.1%
14982
< 0.1%
14951
 
< 0.1%
14911
 
< 0.1%
14881
 
< 0.1%
14861
 
< 0.1%
14851
 
< 0.1%
14842
< 0.1%
14812
< 0.1%
14743
< 0.1%

Interest_Rate
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 945
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68.77264
Minimum: 1
Maximum: 5799
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
median: 13
Q3: 20
95-th percentile: 32
Maximum: 5799
Range: 5798
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 451.6023629
Coefficient of variation (CV): 6.566599202
Kurtosis: 92.48656451
Mean: 68.77264
Median Absolute Deviation (MAD): 6
Skewness: 9.370223147
Sum: 3438632
Variance: 203944.6942
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
82503
 
5.0%
52500
 
5.0%
62368
 
4.7%
122288
 
4.6%
102259
 
4.5%
92253
 
4.5%
72250
 
4.5%
112198
 
4.4%
182052
 
4.1%
151992
 
4.0%
Other values (935)27337
54.7%
ValueCountFrequency (%)
11344
2.7%
21245
2.5%
31388
2.8%
41287
2.6%
52500
5.0%
62368
4.7%
72250
4.5%
82503
5.0%
92253
4.5%
102259
4.5%
ValueCountFrequency (%)
57991
< 0.1%
57921
< 0.1%
57731
< 0.1%
57592
< 0.1%
57521
< 0.1%
57481
< 0.1%
57471
< 0.1%
57431
< 0.1%
57361
< 0.1%
57321
< 0.1%

Num_of_Loan
Categorical

HIGH CARDINALITY

Distinct: 263
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
2: 7173
3: 7114
4: 6982
0: 5163
1: 5029
Other values (258): 18539

Length

Max length: 5
Median length: 1
Mean length: 1.17906
Min length: 1

Characters and Unicode

Total characters: 58953
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 226
Unique (%): 0.5%

Sample

1st row: 4
2nd row: 4
3rd row: 4
4th row: 4
5th row: 1

Common Values

Value | Count | Frequency (%)
2 | 7173 | 14.3%
3 | 7114 | 14.2%
4 | 6982 | 14.0%
0 | 5163 | 10.3%
1 | 5029 | 10.1%
6 | 3707 | 7.4%
7 | 3483 | 7.0%
5 | 3437 | 6.9%
-100 | 1974 | 3.9%
9 | 1746 | 3.5%
Other values (253) | 4192 | 8.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
27515
15.0%
37514
15.0%
47368
14.7%
05446
10.9%
15295
10.6%
63902
7.8%
73680
7.4%
53617
7.2%
1001974
 
3.9%
91837
 
3.7%
Other values (242)1852
 
3.7%

Most occurring characters

ValueCountFrequency (%)
09458
16.0%
27597
12.9%
37597
12.9%
47476
12.7%
17438
12.6%
63976
6.7%
73742
 
6.3%
53696
 
6.3%
_2436
 
4.1%
-1974
 
3.3%
Other values (2)3563
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number54543
92.5%
Connector Punctuation2436
 
4.1%
Dash Punctuation1974
 
3.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
09458
17.3%
27597
13.9%
37597
13.9%
47476
13.7%
17438
13.6%
63976
7.3%
73742
 
6.9%
53696
 
6.8%
91908
 
3.5%
81655
 
3.0%
Connector Punctuation
ValueCountFrequency (%)
_2436
100.0%
Dash Punctuation
ValueCountFrequency (%)
-1974
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common58953
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
09458
16.0%
27597
12.9%
37597
12.9%
47476
12.7%
17438
12.6%
63976
6.7%
73742
 
6.3%
53696
 
6.3%
_2436
 
4.1%
-1974
 
3.3%
Other values (2)3563
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII58953
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
09458
16.0%
27597
12.9%
37597
12.9%
47476
12.7%
17438
12.6%
63976
6.7%
73742
 
6.3%
53696
 
6.3%
_2436
 
4.1%
-1974
 
3.3%
Other values (2)3563
 
6.0%
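
The character counts above show 2,436 underscores and 1,974 minus signs mixed into what is otherwise a count of loans, and the common values include -100 with 1,974 occurrences. A minimal sketch of one way to coerce such values back to integers follows. It assumes that underscores are stray characters attached to otherwise valid numbers and that -100 is a placeholder rather than a real count; the report confirms neither, and `parse_count` is a hypothetical helper name.

```python
from typing import Optional


def parse_count(raw: Optional[str], sentinel: int = -100) -> Optional[int]:
    """Return the integer held in `raw`, or None if it is missing,
    equal to the sentinel, or not parsable as an integer."""
    if raw is None:
        return None
    # Assumption: underscores are noise, not part of the value.
    text = raw.replace("_", "").strip()
    try:
        value = int(text)
    except ValueError:
        return None
    # Assumption: the sentinel (-100 here) stands for "unknown".
    return None if value == sentinel else value


samples = ["4", "1", "3_", "-100", "", None]
cleaned = [parse_count(s) for s in samples]
print(cleaned)
```

Returning None instead of dropping rows keeps the decision about imputation separate from the parsing step.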

Type_of_Loan
Categorical

HIGH CARDINALITY
MISSING

Distinct6260
Distinct (%)14.1%
Missing5704
Missing (%)11.4%
Memory size390.8 KiB
Not Specified
 
704
Credit-Builder Loan
 
640
Personal Loan
 
636
Debt Consolidation Loan
 
632
Student Loan
 
620
Other values (6255)
41064 

Length

Max length182
Median length142
Mean length66.68358317
Min length9

Characters and Unicode

Total characters2953816
Distinct characters33
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAuto Loan, Credit-Builder Loan, Personal Loan, and Home Equity Loan
2nd rowAuto Loan, Credit-Builder Loan, Personal Loan, and Home Equity Loan
3rd rowAuto Loan, Credit-Builder Loan, Personal Loan, and Home Equity Loan
4th rowAuto Loan, Credit-Builder Loan, Personal Loan, and Home Equity Loan
5th rowCredit-Builder Loan

Common Values

Value                             Count   Frequency (%)
Not Specified                     704     1.4%
Credit-Builder Loan               640     1.3%
Personal Loan                     636     1.3%
Debt Consolidation Loan           632     1.3%
Student Loan                      620     1.2%
Payday Loan                       600     1.2%
Mortgage Loan                     588     1.2%
Auto Loan                         576     1.2%
Home Equity Loan                  568     1.1%
Personal Loan, and Student Loan   160     0.3%
Other values (6250)               38572   77.1%
(Missing)                         5704    11.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
loan156836
36.4%
and38732
 
9.0%
payday20284
 
4.7%
credit-builder20220
 
4.7%
not19808
 
4.6%
specified19808
 
4.6%
home19552
 
4.5%
equity19552
 
4.5%
student19484
 
4.5%
mortgage19468
 
4.5%
Other values (4)77216
17.9%

Most occurring characters

ValueCountFrequency (%)
386664
13.1%
o312268
10.6%
a294436
 
10.0%
n273272
 
9.3%
e177392
 
6.0%
t175788
 
6.0%
d158136
 
5.4%
L156836
 
5.3%
i138384
 
4.7%
,132348
 
4.5%
Other values (23)748292
25.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2002136
67.8%
Uppercase Letter412448
 
14.0%
Space Separator386664
 
13.1%
Other Punctuation132348
 
4.5%
Dash Punctuation20220
 
0.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o312268
15.6%
a294436
14.7%
n273272
13.6%
e177392
8.9%
t175788
8.8%
d158136
7.9%
i138384
6.9%
r79352
 
4.0%
u78252
 
3.9%
y60120
 
3.0%
Other values (9)254736
12.7%
Uppercase Letter
ValueCountFrequency (%)
L156836
38.0%
P39728
 
9.6%
C39608
 
9.6%
S39292
 
9.5%
B20220
 
4.9%
N19808
 
4.8%
H19552
 
4.7%
E19552
 
4.7%
M19468
 
4.7%
D19388
 
4.7%
Space Separator
ValueCountFrequency (%)
386664
100.0%
Other Punctuation
ValueCountFrequency (%)
,132348
100.0%
Dash Punctuation
ValueCountFrequency (%)
-20220
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2414584
81.7%
Common539232
 
18.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
o312268
12.9%
a294436
12.2%
n273272
11.3%
e177392
 
7.3%
t175788
 
7.3%
d158136
 
6.5%
L156836
 
6.5%
i138384
 
5.7%
r79352
 
3.3%
u78252
 
3.2%
Other values (20)570468
23.6%
Common
ValueCountFrequency (%)
386664
71.7%
,132348
 
24.5%
-20220
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII2953816
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
386664
13.1%
o312268
10.6%
a294436
 
10.0%
n273272
 
9.3%
e177392
 
6.0%
t175788
 
6.0%
d158136
 
5.4%
L156836
 
5.3%
i138384
 
4.7%
,132348
 
4.5%
Other values (23)748292
25.3%
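
The sample rows for Type_of_Loan hold several loan types in one comma-separated string, with "and" before the last item, which explains the 6,260 distinct values. A minimal sketch of splitting such a string into individual loan types follows. It assumes the separator is always a comma optionally followed by "and", as in the samples shown; `split_loan_types` is a hypothetical helper name.

```python
import re
from typing import List, Optional

# A comma, optional whitespace, and an optional "and " before the next item.
_SEPARATOR = re.compile(r",\s*(?:and\s+)?")


def split_loan_types(raw: Optional[str]) -> List[str]:
    """Split a Type_of_Loan string into a list of loan type names."""
    if not raw:
        return []
    return [part.strip() for part in _SEPARATOR.split(raw) if part.strip()]


row = "Auto Loan, Credit-Builder Loan, Personal Loan, and Home Equity Loan"
print(split_loan_types(row))
```

Once split, the individual types can be counted or one-hot encoded, which reduces the cardinality to the handful of loan types that appear in the word counts above.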

Delay_from_due_date
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct73
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean21.05264
Minimum-5
Maximum67
Zeros626
Zeros (%)1.3%
Negative298
Negative (%)0.6%
Memory size390.8 KiB

Quantile statistics

Minimum                     -5
5-th percentile             3
Q1                          10
median                      18
Q3                          28
95-th percentile            54
Maximum                     67
Range                       72
Interquartile range (IQR)   18

Descriptive statistics

Standard deviation                14.86039722
Coefficient of variation (CV)     0.7058685858
Kurtosis                          0.3444273989
Mean                              21.05264
Median Absolute Deviation (MAD)   9
Skewness                          0.9649281101
Sum                               1052632
Variance                          220.8314057
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)
Most common values

Value               Count   Frequency (%)
13                  1761    3.5%
15                  1759    3.5%
8                   1680    3.4%
9                   1656    3.3%
10                  1645    3.3%
14                  1636    3.3%
12                  1625    3.2%
7                   1587    3.2%
6                   1584    3.2%
11                  1573    3.1%
Other values (63)   33494   67.0%

Smallest 10 values

Value   Count   Frequency (%)
-5      18      < 0.1%
-4      49      0.1%
-3      59      0.1%
-2      71      0.1%
-1      101     0.2%
0       626     1.3%
1       668     1.3%
2       669     1.3%
3       848     1.7%
4       825     1.7%

Largest 10 values

Value   Count   Frequency (%)
67      7       < 0.1%
66      12      < 0.1%
65      30      0.1%
64      33      0.1%
63      21      < 0.1%
62      279     0.6%
61      271     0.5%
60      259     0.5%
59      250     0.5%
58      282     0.6%
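
Several of the derived statistics reported for Delay_from_due_date follow directly from the others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the IQR is Q3 minus Q1, and the range is maximum minus minimum. The short check below recomputes them from the figures in this section. It is a sanity check on the reported numbers only, not a reconstruction of how the profiling tool computes them.

```python
import math

# Figures copied from the Delay_from_due_date section of the report.
mean = 21.05264
std = 14.86039722
q1, q3 = 10, 28
minimum, maximum = -5, 67

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

# Compare with the values printed in the report.
assert math.isclose(cv, 0.7058685858, rel_tol=1e-6)
assert math.isclose(variance, 220.8314057, rel_tol=1e-6)
assert iqr == 18
assert value_range == 72

print(round(cv, 4), round(variance, 2), iqr, value_range)
```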

Num_of_Delayed_Payment
Categorical

HIGH CARDINALITY
MISSING

Distinct443
Distinct (%)1.0%
Missing3498
Missing (%)7.0%
Memory size390.8 KiB
19
 
2622
15
 
2594
18
 
2570
16
 
2548
17
 
2545
Other values (438)
33623 

Length

Max length5
Median length2
Mean length1.772912993
Min length1

Characters and Unicode

Total characters82444
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique367 ?
Unique (%)0.8%

Sample

1st row7
2nd row9
3rd row4
4th row5
5th row1

Common Values

Value                Count   Frequency (%)
19                   2622    5.2%
15                   2594    5.2%
18                   2570    5.1%
16                   2548    5.1%
17                   2545    5.1%
10                   2517    5.0%
12                   2483    5.0%
11                   2440    4.9%
20                   2422    4.8%
9                    2365    4.7%
Other values (433)   21396   42.8%
(Missing)            3498    7.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
192707
 
5.8%
152674
 
5.8%
162637
 
5.7%
172636
 
5.7%
182631
 
5.7%
102591
 
5.6%
122563
 
5.5%
202518
 
5.4%
112504
 
5.4%
92440
 
5.2%
Other values (398)20601
44.3%

Most occurring characters

ValueCountFrequency (%)
130118
36.5%
213020
15.8%
06032
 
7.3%
95262
 
6.4%
85260
 
6.4%
54698
 
5.7%
34317
 
5.2%
74033
 
4.9%
64015
 
4.9%
43975
 
4.8%
Other values (2)1714
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number80730
97.9%
Connector Punctuation1427
 
1.7%
Dash Punctuation287
 
0.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
130118
37.3%
213020
16.1%
06032
 
7.5%
95262
 
6.5%
85260
 
6.5%
54698
 
5.8%
34317
 
5.3%
74033
 
5.0%
64015
 
5.0%
43975
 
4.9%
Connector Punctuation
ValueCountFrequency (%)
_1427
100.0%
Dash Punctuation
ValueCountFrequency (%)
-287
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common82444
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
130118
36.5%
213020
15.8%
06032
 
7.3%
95262
 
6.4%
85260
 
6.4%
54698
 
5.7%
34317
 
5.2%
74033
 
4.9%
64015
 
4.9%
43975
 
4.8%
Other values (2)1714
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII82444
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
130118
36.5%
213020
15.8%
06032
 
7.3%
95262
 
6.4%
85260
 
6.4%
54698
 
5.7%
34317
 
5.2%
74033
 
4.9%
64015
 
4.9%
43975
 
4.8%
Other values (2)1714
 
2.1%

Changed_Credit_Limit
Categorical

HIGH CARDINALITY

Distinct3927
Distinct (%)7.9%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
_
 
1059
11.5
 
70
11.32
 
63
7.01
 
60
7.35
 
60
Other values (3922)
48688 

Length

Max length21
Median length20
Mean length4.69558
Min length1

Characters and Unicode

Total characters234779
Distinct characters13
Distinct categories4 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique695 ?
Unique (%)1.4%

Sample

1st row11.27
2nd row13.27
3rd row12.27
4th row11.27
5th row5.42

Common Values

Value                 Count   Frequency (%)
_                     1059    2.1%
11.5                  70      0.1%
11.32                 63      0.1%
7.01                  60      0.1%
7.35                  60      0.1%
10.06                 57      0.1%
8.22                  56      0.1%
7.63                  56      0.1%
7.69                  56      0.1%
10.3                  55      0.1%
Other values (3917)   48408   96.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
1059
 
2.1%
11.570
 
0.1%
11.3263
 
0.1%
7.0160
 
0.1%
7.3560
 
0.1%
3.9357
 
0.1%
10.0657
 
0.1%
8.2256
 
0.1%
7.6356
 
0.1%
7.6956
 
0.1%
Other values (3471)48406
96.8%

Most occurring characters

ValueCountFrequency (%)
.48941
20.8%
134489
14.7%
923001
9.8%
019861
8.5%
218122
 
7.7%
715298
 
6.5%
815257
 
6.5%
514815
 
6.3%
614574
 
6.2%
314319
 
6.1%
Other values (3)16102
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number183944
78.3%
Other Punctuation48941
 
20.8%
Connector Punctuation1059
 
0.5%
Dash Punctuation835
 
0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
134489
18.7%
923001
12.5%
019861
10.8%
218122
9.9%
715298
8.3%
815257
8.3%
514815
8.1%
614574
7.9%
314319
7.8%
414208
7.7%
Other Punctuation
ValueCountFrequency (%)
.48941
100.0%
Connector Punctuation
ValueCountFrequency (%)
_1059
100.0%
Dash Punctuation
ValueCountFrequency (%)
-835
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common234779
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.48941
20.8%
134489
14.7%
923001
9.8%
019861
8.5%
218122
 
7.7%
715298
 
6.5%
815257
 
6.5%
514815
 
6.3%
614574
 
6.2%
314319
 
6.1%
Other values (3)16102
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII234779
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.48941
20.8%
134489
14.7%
923001
9.8%
019861
8.5%
218122
 
7.7%
715298
 
6.5%
815257
 
6.5%
514815
 
6.3%
614574
 
6.2%
314319
 
6.1%
Other values (3)16102
 
6.9%

Num_Credit_Inquiries
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct750
Distinct (%)1.5%
Missing1035
Missing (%)2.1%
Infinite0
Infinite (%)0.0%
Mean30.08020014
Minimum0
Maximum2593
Zeros1102
Zeros (%)2.2%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum                     0
5-th percentile             1
Q1                          4
median                      7
Q3                          10
95-th percentile            15
Maximum                     2593
Range                       2593
Interquartile range (IQR)   6

Descriptive statistics

Standard deviation                196.9841205
Coefficient of variation (CV)     6.548630646
Kurtosis                          96.36985966
Mean                              30.08020014
Median Absolute Deviation (MAD)   3
Skewness                          9.587172676
Sum                               1472877
Variance                          38802.74373
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)
Most common values

Value                Count   Frequency (%)
5                    4709    9.4%
4                    4402    8.8%
6                    4375    8.8%
7                    4295    8.6%
8                    3922    7.8%
9                    3523    7.0%
3                    3466    6.9%
11                   2996    6.0%
10                   2982    6.0%
12                   2585    5.2%
Other values (740)   11710   23.4%

Smallest 10 values

Value   Count   Frequency (%)
0       1102    2.2%
1       1747    3.5%
2       2454    4.9%
3       3466    6.9%
4       4402    8.8%
5       4709    9.4%
6       4375    8.8%
7       4295    8.6%
8       3922    7.8%
9       3523    7.0%

Largest 10 values

Value   Count   Frequency (%)
2593    1       < 0.1%
2592    1       < 0.1%
2588    1       < 0.1%
2586    1       < 0.1%
2583    1       < 0.1%
2576    1       < 0.1%
2575    1       < 0.1%
2574    1       < 0.1%
2570    1       < 0.1%
2567    1       < 0.1%
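
The gap between the 95-th percentile (15) and the maximum (2593) for Num_Credit_Inquiries, together with a skewness of about 9.6, suggests a small number of extreme values. One common convention for flagging them, which the report itself does not prescribe, is Tukey's rule: treat values beyond 1.5 times the IQR from the quartiles as outliers. The sketch below applies it to the quartiles reported above; `tukey_fences` is a hypothetical helper name.

```python
from typing import Tuple


def tukey_fences(q1: float, q3: float, k: float = 1.5) -> Tuple[float, float]:
    """Return the (lower, upper) fences q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles and maximum copied from the Num_Credit_Inquiries section.
lower, upper = tukey_fences(q1=4, q3=10)
maximum = 2593

print(lower, upper, maximum > upper)
```

Under this rule any value above 19 would be flagged, so the ten largest values listed above all qualify. Whether to cap, drop, or keep them is a modelling decision that the quartiles alone cannot settle.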

Credit_Mix
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
Standard
18379 
Good
12260 
_
9805 
Bad
9556 

Length

Max length8
Median length4
Mean length4.6909
Min length1

Characters and Unicode

Total characters234545
Distinct characters10
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowGood
2nd rowGood
3rd rowGood
4th rowGood
5th rowGood

Common Values

Value      Count   Frequency (%)
Standard   18379   36.8%
Good       12260   24.5%
_          9805    19.6%
Bad        9556    19.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
standard18379
36.8%
good12260
24.5%
9805
19.6%
bad9556
19.1%

Most occurring characters

ValueCountFrequency (%)
d58574
25.0%
a46314
19.7%
o24520
10.5%
S18379
 
7.8%
t18379
 
7.8%
n18379
 
7.8%
r18379
 
7.8%
G12260
 
5.2%
_9805
 
4.2%
B9556
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter184545
78.7%
Uppercase Letter40195
 
17.1%
Connector Punctuation9805
 
4.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d58574
31.7%
a46314
25.1%
o24520
13.3%
t18379
 
10.0%
n18379
 
10.0%
r18379
 
10.0%
Uppercase Letter
ValueCountFrequency (%)
S18379
45.7%
G12260
30.5%
B9556
23.8%
Connector Punctuation
ValueCountFrequency (%)
_9805
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin224740
95.8%
Common9805
 
4.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
d58574
26.1%
a46314
20.6%
o24520
10.9%
S18379
 
8.2%
t18379
 
8.2%
n18379
 
8.2%
r18379
 
8.2%
G12260
 
5.5%
B9556
 
4.3%
Common
ValueCountFrequency (%)
_9805
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII234545
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d58574
25.0%
a46314
19.7%
o24520
10.5%
S18379
 
7.8%
t18379
 
7.8%
n18379
 
7.8%
r18379
 
7.8%
G12260
 
5.2%
_9805
 
4.2%
B9556
 
4.1%

Outstanding_Debt
Categorical

HIGH CARDINALITY
UNIFORM

Distinct12685
Distinct (%)25.4%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
1109.03
 
12
1151.7
 
12
1360.45
 
12
460.46
 
12
1428.31
 
8
Other values (12680)
49944 

Length

Max length8
Median length7
Mean length6.43302
Min length3

Characters and Unicode

Total characters321651
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique473 ?
Unique (%)0.9%

Sample

1st row809.98
2nd row809.98
3rd row809.98
4th row809.98
5th row605.03

Common Values

ValueCountFrequency (%)
1109.0312
 
< 0.1%
1151.712
 
< 0.1%
1360.4512
 
< 0.1%
460.4612
 
< 0.1%
1428.318
 
< 0.1%
950.598
 
< 0.1%
1334.818
 
< 0.1%
2329.288
 
< 0.1%
952.398
 
< 0.1%
1812.468
 
< 0.1%
Other values (12675)49904
99.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
1109.0312
 
< 0.1%
1360.4512
 
< 0.1%
460.4612
 
< 0.1%
1151.712
 
< 0.1%
796.888
 
< 0.1%
1434.188
 
< 0.1%
1421.958
 
< 0.1%
852.748
 
< 0.1%
1381.18
 
< 0.1%
1423.888
 
< 0.1%
Other values (12193)49904
99.8%

Most occurring characters

ValueCountFrequency (%)
.50000
15.5%
141784
13.0%
231968
9.9%
329420
9.1%
429176
9.1%
524732
7.7%
624456
7.6%
824004
7.5%
723832
7.4%
923572
7.3%
Other values (2)18707
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number271160
84.3%
Other Punctuation50000
 
15.5%
Connector Punctuation491
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
141784
15.4%
231968
11.8%
329420
10.8%
429176
10.8%
524732
9.1%
624456
9.0%
824004
8.9%
723832
8.8%
923572
8.7%
018216
6.7%
Other Punctuation
ValueCountFrequency (%)
.50000
100.0%
Connector Punctuation
ValueCountFrequency (%)
_491
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common321651
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.50000
15.5%
141784
13.0%
231968
9.9%
329420
9.1%
429176
9.1%
524732
7.7%
624456
7.6%
824004
7.5%
723832
7.4%
923572
7.3%
Other values (2)18707
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII321651
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.50000
15.5%
141784
13.0%
231968
9.9%
329420
9.1%
429176
9.1%
524732
7.7%
624456
7.6%
824004
7.5%
723832
7.4%
923572
7.3%
Other values (2)18707
 
5.8%

Credit_Utilization_Ratio
Real number (ℝ≥0)

UNIQUE

Distinct50000
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean32.27958112
Minimum20.50965206
Maximum48.54066309
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum                     20.50965206
5-th percentile             24.27433854
Q1                          28.06104036
median                      32.28038958
Q3                          36.46859096
95-th percentile            40.24488238
Maximum                     48.54066309
Range                       28.03101103
Interquartile range (IQR)   8.407550602

Descriptive statistics

Standard deviation                5.106237733
Coefficient of variation (CV)     0.1581878561
Kurtosis                          -0.9494207268
Mean                              32.27958112
Median Absolute Deviation (MAD)   4.201862883
Skewness                          0.03759574349
Sum                               1613979.056
Variance                          26.07366378
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
37.291389041
 
< 0.1%
31.509681221
 
< 0.1%
27.901201241
 
< 0.1%
28.669305861
 
< 0.1%
35.059345841
 
< 0.1%
33.147165321
 
< 0.1%
39.767097181
 
< 0.1%
37.461445021
 
< 0.1%
41.838877581
 
< 0.1%
24.257968811
 
< 0.1%
Other values (49990)49990
> 99.9%
ValueCountFrequency (%)
20.509652061
< 0.1%
20.620017321
< 0.1%
20.739225491
< 0.1%
20.800586851
< 0.1%
20.839226381
< 0.1%
20.919647981
< 0.1%
21.119669111
< 0.1%
21.140201931
< 0.1%
21.181581511
< 0.1%
21.187105261
< 0.1%
ValueCountFrequency (%)
48.540663091
< 0.1%
48.228714011
< 0.1%
48.152777491
< 0.1%
48.096457271
< 0.1%
48.065280661
< 0.1%
47.288987261
< 0.1%
47.230103591
< 0.1%
47.163172451
< 0.1%
46.977776381
< 0.1%
46.947533251
< 0.1%

Credit_History_Age
Categorical

HIGH CARDINALITY
MISSING

Distinct399
Distinct (%)0.9%
Missing4470
Missing (%)8.9%
Memory size390.8 KiB
16 Years and 1 Months
 
254
20 Years and 1 Months
 
254
18 Years and 7 Months
 
252
19 Years and 7 Months
 
252
18 Years and 6 Months
 
250
Other values (394)
44268 

Length

Max length22
Median length21
Mean length20.97537887
Min length20

Characters and Unicode

Total characters955009
Distinct characters22
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row22 Years and 9 Months
2nd row22 Years and 10 Months
3rd row23 Years and 0 Months
4th row27 Years and 3 Months
5th row27 Years and 4 Months

Common Values

Value                   Count   Frequency (%)
16 Years and 1 Months   254     0.5%
20 Years and 1 Months   254     0.5%
18 Years and 7 Months   252     0.5%
19 Years and 7 Months   252     0.5%
18 Years and 6 Months   250     0.5%
16 Years and 6 Months   248     0.5%
19 Years and 1 Months   242     0.5%
18 Years and 1 Months   241     0.5%
16 Years and 7 Months   238     0.5%
20 Years and 0 Months   236     0.5%
Other values (389)      43063   86.1%
(Missing)               4470    8.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
and45530
20.0%
months45530
20.0%
years45530
20.0%
66000
 
2.6%
75955
 
2.6%
14948
 
2.2%
84885
 
2.1%
94857
 
2.1%
104766
 
2.1%
114651
 
2.0%
Other values (28)54998
24.2%

Most occurring characters

ValueCountFrequency (%)
182120
19.1%
a91060
9.5%
s91060
9.5%
n91060
9.5%
o45530
 
4.8%
t45530
 
4.8%
Y45530
 
4.8%
e45530
 
4.8%
r45530
 
4.8%
d45530
 
4.8%
Other values (12)226529
23.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter546360
57.2%
Space Separator182120
 
19.1%
Decimal Number135469
 
14.2%
Uppercase Letter91060
 
9.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
136744
27.1%
222565
16.7%
313607
 
10.0%
012936
 
9.5%
69805
 
7.2%
79479
 
7.0%
88668
 
6.4%
98615
 
6.4%
46716
 
5.0%
56334
 
4.7%
Lowercase Letter
ValueCountFrequency (%)
a91060
16.7%
s91060
16.7%
n91060
16.7%
o45530
8.3%
t45530
8.3%
e45530
8.3%
r45530
8.3%
d45530
8.3%
h45530
8.3%
Uppercase Letter
ValueCountFrequency (%)
Y45530
50.0%
M45530
50.0%
Space Separator
ValueCountFrequency (%)
182120
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin637420
66.7%
Common317589
33.3%

Most frequent character per script

Common
ValueCountFrequency (%)
182120
57.3%
136744
 
11.6%
222565
 
7.1%
313607
 
4.3%
012936
 
4.1%
69805
 
3.1%
79479
 
3.0%
88668
 
2.7%
98615
 
2.7%
46716
 
2.1%
Latin
ValueCountFrequency (%)
a91060
14.3%
s91060
14.3%
n91060
14.3%
o45530
7.1%
t45530
7.1%
Y45530
7.1%
e45530
7.1%
r45530
7.1%
d45530
7.1%
M45530
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII955009
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
182120
19.1%
a91060
9.5%
s91060
9.5%
n91060
9.5%
o45530
 
4.8%
t45530
 
4.8%
Y45530
 
4.8%
e45530
 
4.8%
r45530
 
4.8%
d45530
 
4.8%
Other values (12)226529
23.7%
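
Credit_History_Age is stored as text such as "22 Years and 9 Months", which is why it is profiled as a high-cardinality categorical rather than a number. A minimal sketch of converting it to a single numeric value (total months) follows. It assumes every non-missing value matches the "N Years and M Months" pattern seen in the samples; `history_age_in_months` is a hypothetical helper name.

```python
import re
from typing import Optional

# Matches strings such as "22 Years and 9 Months".
_AGE = re.compile(r"^\s*(\d+)\s+Years?\s+and\s+(\d+)\s+Months?\s*$")


def history_age_in_months(raw: Optional[str]) -> Optional[int]:
    """Convert 'N Years and M Months' to N * 12 + M, or None if it does not match."""
    if raw is None:
        return None
    match = _AGE.match(raw)
    if match is None:
        return None
    years, months = int(match.group(1)), int(match.group(2))
    return years * 12 + months


for text in ["22 Years and 9 Months", "22 Years and 10 Months", "23 Years and 0 Months"]:
    print(text, "->", history_age_in_months(text))
```

Values that do not match return None rather than raising, so the 4,470 missing entries reported above pass through unchanged.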

Payment_of_Min_Amount
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
Yes
26158 
No
17849 
NM
5993 

Length

Max length3
Median length3
Mean length2.52316
Min length2

Characters and Unicode

Total characters126158
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo

Common Values

Value   Count   Frequency (%)
Yes     26158   52.3%
No      17849   35.7%
NM      5993    12.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
yes26158
52.3%
no17849
35.7%
nm5993
 
12.0%

Most occurring characters

ValueCountFrequency (%)
Y26158
20.7%
e26158
20.7%
s26158
20.7%
N23842
18.9%
o17849
14.1%
M5993
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter70165
55.6%
Uppercase Letter55993
44.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y26158
46.7%
N23842
42.6%
M5993
 
10.7%
Lowercase Letter
ValueCountFrequency (%)
e26158
37.3%
s26158
37.3%
o17849
25.4%

Most occurring scripts

ValueCountFrequency (%)
Latin126158
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y26158
20.7%
e26158
20.7%
s26158
20.7%
N23842
18.9%
o17849
14.1%
M5993
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII126158
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y26158
20.7%
e26158
20.7%
s26158
20.7%
N23842
18.9%
o17849
14.1%
M5993
 
4.8%

Total_EMI_per_month
Real number (ℝ≥0)

ZEROS

Distinct13144
Distinct (%)26.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1491.304305
Minimum0
Maximum82398
Zeros5002
Zeros (%)10.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum                     0
5-th percentile             0
Q1                          32.22238767
median                      74.73334891
Q3                          176.1574914
95-th percentile            683.4115255
Maximum                     82398
Range                       82398
Interquartile range (IQR)   143.9351037

Descriptive statistics

Standard deviation                8595.647887
Coefficient of variation (CV)     5.763845687
Kurtosis                          49.80255452
Mean                              1491.304305
Median Absolute Deviation (MAD)   55.07584522
Skewness                          6.946275256
Sum                               74565215.26
Variance                          73885162.59
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
05002
 
10.0%
54.701277414
 
< 0.1%
93.601617434
 
< 0.1%
96.26557194
 
< 0.1%
92.425471384
 
< 0.1%
244.42007384
 
< 0.1%
104.74617744
 
< 0.1%
69.37983694
 
< 0.1%
45.174379384
 
< 0.1%
85.886047164
 
< 0.1%
Other values (13134)44962
89.9%
ValueCountFrequency (%)
05002
10.0%
4.4628374674
 
< 0.1%
4.7131835724
 
< 0.1%
4.8656896774
 
< 0.1%
4.9161385424
 
< 0.1%
5.1384846964
 
< 0.1%
5.2184663594
 
< 0.1%
5.249273274
 
< 0.1%
5.2622910484
 
< 0.1%
5.3510861514
 
< 0.1%
ValueCountFrequency (%)
823981
< 0.1%
823471
< 0.1%
823161
< 0.1%
822481
< 0.1%
822351
< 0.1%
822251
< 0.1%
820911
< 0.1%
820711
< 0.1%
820231
< 0.1%
820161
< 0.1%

Amount_invested_monthly
Categorical

HIGH CARDINALITY
MISSING

Distinct45450
Distinct (%)95.2%
Missing2271
Missing (%)4.5%
Memory size390.8 KiB
__10000__
 
2175
0.0
 
106
146.45223558174197
 
1
124.69037914425093
 
1
263.87560234624675
 
1
Other values (45445)
45445 

Length

Max length18
Median length17
Mean length16.95711203
Min length3

Characters and Unicode

Total characters809346
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique45448 ?
Unique (%)95.2%

Sample

1st row236.64268203272135
2nd row21.465380264657146
3rd row148.23393788500925
4th row39.08251089460281
5th row39.684018417945296

Common Values

Value                  Count   Frequency (%)
__10000__              2175    4.3%
0.0                    106     0.2%
146.45223558174197     1       < 0.1%
124.69037914425093     1       < 0.1%
263.87560234624675     1       < 0.1%
36.367528830515006     1       < 0.1%
82.41238403428513      1       < 0.1%
437.963686580976       1       < 0.1%
34.19467092596217      1       < 0.1%
101.72654992407084     1       < 0.1%
Other values (45440)   45440   90.9%
(Missing)              2271    4.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
100002175
 
4.6%
0.0106
 
0.2%
80.477990722483161
 
< 0.1%
47.778256918951261
 
< 0.1%
428.21407842682831
 
< 0.1%
220.23125216778131
 
< 0.1%
416.67486208474731
 
< 0.1%
72.905844123382011
 
< 0.1%
438.312593410878831
 
< 0.1%
58.496941068639181
 
< 0.1%
Other values (45440)45440
95.2%

Most occurring characters

ValueCountFrequency (%)
186365
10.7%
277857
9.6%
476027
9.4%
375728
9.4%
075606
9.3%
674699
9.2%
574363
9.2%
872540
9.0%
772532
9.0%
969375
8.6%
Other values (2)54254
6.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number755092
93.3%
Other Punctuation45554
 
5.6%
Connector Punctuation8700
 
1.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
186365
11.4%
277857
10.3%
476027
10.1%
375728
10.0%
075606
10.0%
674699
9.9%
574363
9.8%
872540
9.6%
772532
9.6%
969375
9.2%
Other Punctuation
ValueCountFrequency (%)
.45554
100.0%
Connector Punctuation
ValueCountFrequency (%)
_8700
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common809346
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
186365
10.7%
277857
9.6%
476027
9.4%
375728
9.4%
075606
9.3%
674699
9.2%
574363
9.2%
872540
9.0%
772532
9.0%
969375
8.6%
Other values (2)54254
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII809346
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
186365
10.7%
277857
9.6%
476027
9.4%
375728
9.4%
075606
9.3%
674699
9.2%
574363
9.2%
872540
9.0%
772532
9.0%
969375
8.6%
Other values (2)54254
6.7%
Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
Low_spent_Small_value_payments
12694 
High_spent_Medium_value_payments
8922 
High_spent_Large_value_payments
6844 
Low_spent_Medium_value_payments
6837 
High_spent_Small_value_payments
5651 
Other values (2)
9052 

Length

Max length32
Median length31
Mean length28.91952
Min length6

Characters and Unicode

Total characters1445976
Distinct characters29
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowLow_spent_Small_value_payments
2nd rowHigh_spent_Medium_value_payments
3rd rowLow_spent_Medium_value_payments
4th rowHigh_spent_Medium_value_payments
5th rowHigh_spent_Large_value_payments

Common Values

Value                              Count   Frequency (%)
Low_spent_Small_value_payments     12694   25.4%
High_spent_Medium_value_payments   8922    17.8%
High_spent_Large_value_payments    6844    13.7%
Low_spent_Medium_value_payments    6837    13.7%
High_spent_Small_value_payments    5651    11.3%
Low_spent_Large_value_payments     5252    10.5%
!@9#%8                             3800    7.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
low_spent_small_value_payments12694
25.4%
high_spent_medium_value_payments8922
17.8%
high_spent_large_value_payments6844
13.7%
low_spent_medium_value_payments6837
13.7%
high_spent_small_value_payments5651
11.3%
low_spent_large_value_payments5252
10.5%
9#%83800
 
7.6%

Most occurring characters

ValueCountFrequency (%)
_184800
12.8%
e166455
11.5%
a122841
 
8.5%
s92400
 
6.4%
p92400
 
6.4%
n92400
 
6.4%
t92400
 
6.4%
l82890
 
5.7%
m80304
 
5.6%
u61959
 
4.3%
Other values (19)377127
26.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1145976
79.3%
Connector Punctuation184800
 
12.8%
Uppercase Letter92400
 
6.4%
Other Punctuation15200
 
1.1%
Decimal Number7600
 
0.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e166455
14.5%
a122841
10.7%
s92400
8.1%
p92400
8.1%
n92400
8.1%
t92400
8.1%
l82890
 
7.2%
m80304
 
7.0%
u61959
 
5.4%
v46200
 
4.0%
Other values (8)215727
18.8%
Uppercase Letter
ValueCountFrequency (%)
L36879
39.9%
H21417
23.2%
S18345
19.9%
M15759
17.1%
Other Punctuation
ValueCountFrequency (%)
!3800
25.0%
@3800
25.0%
#3800
25.0%
%3800
25.0%
Decimal Number
ValueCountFrequency (%)
93800
50.0%
83800
50.0%
Connector Punctuation
ValueCountFrequency (%)
_184800
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1238376
85.6%
Common207600
 
14.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e166455
13.4%
a122841
 
9.9%
s92400
 
7.5%
p92400
 
7.5%
n92400
 
7.5%
t92400
 
7.5%
l82890
 
6.7%
m80304
 
6.5%
u61959
 
5.0%
v46200
 
3.7%
Other values (12)308127
24.9%
Common
ValueCountFrequency (%)
_184800
89.0%
!3800
 
1.8%
@3800
 
1.8%
93800
 
1.8%
#3800
 
1.8%
%3800
 
1.8%
83800
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII1445976
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_184800
12.8%
e166455
11.5%
a122841
 
8.5%
s92400
 
6.4%
p92400
 
6.4%
n92400
 
6.4%
t92400
 
6.4%
l82890
 
5.7%
m80304
 
5.6%
u61959
 
4.3%
Other values (19)377127
26.1%
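
Six of the seven categories above follow the pattern "<Low|High>_spent_<Small|Medium|Large>_value_payments", while the seventh, "!@9#%8" (3,800 rows), does not. A minimal sketch of splitting the well-formed labels into two ordinal components and mapping everything else to None follows. It assumes "!@9#%8" is a corrupted or placeholder entry, which the report does not confirm; `split_behaviour` is a hypothetical helper name.

```python
import re
from typing import Optional, Tuple

_PATTERN = re.compile(r"^(Low|High)_spent_(Small|Medium|Large)_value_payments$")


def split_behaviour(raw: Optional[str]) -> Optional[Tuple[str, str]]:
    """Return (spend level, payment size), or None for labels that do not match."""
    if raw is None:
        return None
    match = _PATTERN.match(raw)
    return (match.group(1), match.group(2)) if match else None


print(split_behaviour("High_spent_Medium_value_payments"))
print(split_behaviour("!@9#%8"))
```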

Monthly_Balance
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct49433
Distinct (%)> 99.9%
Missing562
Missing (%)1.1%
Memory size390.8 KiB
__-333333333333333333333333333__
 
6
329.8161038704352
 
1
403.8475175077663
 
1
382.1571836917757
 
1
203.42087912420956
 
1
Other values (49428)
49428 

Length

Max length32
Median length17
Mean length17.34234799
Min length13

Characters and Unicode

Total characters857371
Distinct characters13
Distinct categories4 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique49432 ?
Unique (%)> 99.9%

Sample

1st row186.26670208571772
2nd row361.44400385378196
3rd row264.67544623342997
4th row343.82687322383634
5th row485.2984336755923

Common Values

Value                              Count   Frequency (%)
__-333333333333333333333333333__   6       < 0.1%
329.8161038704352                  1       < 0.1%
403.8475175077663                  1       < 0.1%
382.1571836917757                  1       < 0.1%
203.42087912420956                 1       < 0.1%
300.7943892000848                  1       < 0.1%
391.68606982234553                 1       < 0.1%
728.4641774729828                  1       < 0.1%
265.7679202700299                  1       < 0.1%
500.19846383311386                 1       < 0.1%
Other values (49423)               49423   98.8%
(Missing)                          562     1.1%

Length

Histogram of lengths of the category
Value                        Count  Frequency (%)
333333333333333333333333333  6      < 0.1%
385.3004381161553            1      < 0.1%
300.5226866460017            1      < 0.1%
459.94881480995565           1      < 0.1%
314.86461341416197           1      < 0.1%
293.7475635901477            1      < 0.1%
520.0915175778559            1      < 0.1%
260.4064883149828            1      < 0.1%
388.7936705325999            1      < 0.1%
595.220959147716             1      < 0.1%
Other values (49423)         49423  > 99.9%

Most occurring characters

Value             Count  Frequency (%)
3                 90947  10.6%
2                 90118  10.5%
4                 84677  9.9%
5                 81313  9.5%
6                 80808  9.4%
7                 78501  9.2%
1                 77986  9.1%
8                 77928  9.1%
9                 74663  8.7%
0                 70968  8.3%
Other values (3)  49462  5.8%

Most occurring categories

Value                  Count   Frequency (%)
Decimal Number         807909  94.2%
Other Punctuation      49432   5.8%
Connector Punctuation  24      < 0.1%
Dash Punctuation       6       < 0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
3      90947  11.3%
2      90118  11.2%
4      84677  10.5%
5      81313  10.1%
6      80808  10.0%
7      78501  9.7%
1      77986  9.7%
8      77928  9.6%
9      74663  9.2%
0      70968  8.8%

Other Punctuation
Value  Count  Frequency (%)
.      49432  100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      24     100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      6      100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  857371  100.0%

Most frequent character per script

Common
Value             Count  Frequency (%)
3                 90947  10.6%
2                 90118  10.5%
4                 84677  9.9%
5                 81313  9.5%
6                 80808  9.4%
7                 78501  9.2%
1                 77986  9.1%
8                 77928  9.1%
9                 74663  8.7%
0                 70968  8.3%
Other values (3)  49462  5.8%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  857371  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
3                 90947  10.6%
2                 90118  10.5%
4                 84677  9.9%
5                 81313  9.5%
6                 80808  9.4%
7                 78501  9.2%
1                 77986  9.1%
8                 77928  9.1%
9                 74663  8.7%
0                 70968  8.3%
Other values (3)  49462  5.8%

Interactions

[Interaction plots: 64 Matplotlib SVG panels, not reproduced here.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, so it is better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
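
A minimal sketch of that calculation in plain Python is shown below. The helper names `average_ranks` and `spearman_rho` are illustrative, and ties are given the mean of their rank positions, which is one common convention and an assumption here.

```python
from math import sqrt

def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their stds."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in rx) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in ry) / n)
    return cov / (std_x * std_y)

# A nonlinear but strictly increasing relationship (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
print(rho)
```

Because the relationship is strictly increasing, the ranks of x and y coincide and ρ comes out at 1 up to floating-point rounding, even though the relationship is not linear.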

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only its direction does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
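
The following sketch implements that formula directly and checks the invariance property on a small made-up sample. The function name `pearson_r` and the data are illustrative assumptions.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_rescaled)
```

Both calls give the same value up to rounding. A negative scale factor on one variable would flip the sign of r while leaving its magnitude unchanged.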

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
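
The pair-counting definition can be sketched as follows. This is the simple variant often called tau-a, in which tied pairs count as neither concordant nor discordant; libraries frequently report a tie-adjusted variant instead, so results may differ when ties are present. The function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Of the six pairs in this sample, five are concordant and one is discordant, giving τ = (5 - 1) / 6.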

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators of Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
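
For illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table using the correction commonly attributed to Bergsma (2013). It is a plain-Python approximation of the idea, not the profiling tool's own code; the function name is hypothetical, and the table is assumed to have no empty rows or columns.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi squared and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
v_independent = cramers_v_corrected([[25, 25], [25, 25]])
print(v_perfect, v_independent)
```

The first table has perfect association and yields a value of 1 up to rounding, while the second is exactly independent and yields 0.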

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
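
The nullity correlation shown in the heatmap can be understood as an ordinary correlation between "is present" indicators. The sketch below illustrates that idea on a few made-up values; the function name `nullity_correlation`, the use of `None` for missing cells, and the sample data are all assumptions for illustration.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation of the presence indicators (1 = present, 0 = missing)."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is always present or always missing.
        return None
    return cov / sqrt(var_a * var_b)

# Hypothetical columns: name and salary are missing together, loans the opposite.
name = ["Aaron", None, "Rick", None]
salary = [1824.84, None, 3037.99, None]
loans = [None, "Auto Loan", None, "Credit-Builder Loan"]

same = nullity_correlation(name, salary)
opposite = nullity_correlation(name, loans)
print(same, opposite)
```

A value near +1 means two columns tend to be missing in the same rows, and a value near -1 means one tends to be present exactly when the other is missing.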

Sample

First rows

ID, Customer_ID, Month, Name, Age, SSN, Occupation, Annual_Income, Monthly_Inhand_Salary, Num_Bank_Accounts, Num_Credit_Card, Interest_Rate, Num_of_Loan, Type_of_Loan, Delay_from_due_date, Num_of_Delayed_Payment, Changed_Credit_Limit, Num_Credit_Inquiries, Credit_Mix, Outstanding_Debt, Credit_Utilization_Ratio, Credit_History_Age, Payment_of_Min_Amount, Total_EMI_per_month, Amount_invested_monthly, Payment_Behaviour, Monthly_Balance
00x160aCUS_0xd40SeptemberAaron Maashoh23821-00-0265Scientist19114.121824.8433333434Auto Loan, Credit-Builder Loan, Personal Loan, and Home Equity Loan3711.272022.0Good809.9835.03040222 Years and 9 MonthsNo49.574949236.64268203272135Low_spent_Small_value_payments186.26670208571772
10x160bCUS_0xd40OctoberAaron Maashoh24821-00-0265Scientist19114.121824.8433333434Auto Loan, Credit-Builder Loan, Personal Loan, and Home Equity Loan3913.274.0Good809.9833.05311422 Years and 10 MonthsNo49.57494921.465380264657146High_spent_Medium_value_payments361.44400385378196
20x160cCUS_0xd40NovemberAaron Maashoh24821-00-0265Scientist19114.121824.8433333434Auto Loan, Credit-Builder Loan, Personal Loan, and Home Equity Loan-1412.274.0Good809.9833.811894NaNNo49.574949148.23393788500925Low_spent_Medium_value_payments264.67544623342997
30x160dCUS_0xd40DecemberAaron Maashoh24_821-00-0265Scientist19114.12NaN3434Auto Loan, Credit-Builder Loan, Personal Loan, and Home Equity Loan4511.274.0Good809.9832.43055923 Years and 0 MonthsNo49.57494939.08251089460281High_spent_Medium_value_payments343.82687322383634
40x1616CUS_0x21b1SeptemberRick Rothackerj28004-07-5839_______34847.843037.9866672461Credit-Builder Loan315.425.0Good605.0325.92682227 Years and 3 MonthsNo18.81621539.684018417945296High_spent_Large_value_payments485.2984336755923
50x1617CUS_0x21b1OctoberRick Rothackerj28#F%$D@*&8Teacher34847.843037.9866672461Credit-Builder Loan335.425.0Good605.0330.11660027 Years and 4 MonthsNo18.816215251.62736875017606Low_spent_Large_value_payments303.3550833433617
60x1618CUS_0x21b1NovemberRick Rothackerj28004-07-5839Teacher34847.843037.9866672461Credit-Builder Loan3NaN5.425.0_605.0330.99642427 Years and 5 MonthsNo18.81621572.68014533363515High_spent_Large_value_payments452.30230675990265
70x1619CUS_0x21b1DecemberRick Rothackerj28004-07-5839Teacher34847.843037.9866672461Credit-Builder Loan32_7.425.0_605.0333.87516727 Years and 6 MonthsNo18.816215153.53448761392985!@9#%8421.44796447960783
80x1622CUS_0x2dbcSeptemberLangep35486-85-3974Engineer143162.64NaN1583Auto Loan, Auto Loan, and Not Specified819427.13.0Good1303.0135.22970718 Years and 5 MonthsNo246.992319397.50365354404653Low_spent_Medium_value_payments854.2260270022115
90x1623CUS_0x2dbcOctoberLangep35486-85-3974Engineer143162.6412187.2200001583Auto Loan, Auto Loan, and Not Specified632.13.0Good1303.0135.68583618 Years and 6 MonthsNo246.992319453.6151305781054Low_spent_Large_value_payments788.1145499681528

Last rows

ID, Customer_ID, Month, Name, Age, SSN, Occupation, Annual_Income, Monthly_Inhand_Salary, Num_Bank_Accounts, Num_Credit_Card, Interest_Rate, Num_of_Loan, Type_of_Loan, Delay_from_due_date, Num_of_Delayed_Payment, Changed_Credit_Limit, Num_Credit_Inquiries, Credit_Mix, Outstanding_Debt, Credit_Utilization_Ratio, Credit_History_Age, Payment_of_Min_Amount, Total_EMI_per_month, Amount_invested_monthly, Payment_Behaviour, Monthly_Balance
499900x25fd8CUS_0xaf61NovemberChris Wickhamm50133-16-7738Writer37188.13097.0083331442523Home Equity Loan, Mortgage Loan, and Student Loan7125.383.0Good620.6425.70841430 Years and 7 MonthsNo84.205949183.3656280777276Low_spent_Large_value_payments312.1292558307615
499910x25fd9CUS_0xaf61DecemberChris Wickhamm50_133-16-7738Writer37188.13097.0083331453Home Equity Loan, Mortgage Loan, and Student Loan3125.383.0_620.6436.49838330 Years and 8 MonthsNo33013.000000238.3993828976901Low_spent_Large_value_payments257.095501010799
499920x25fe2CUS_0x8600SeptemberSarah McBridec29031-35-0942Architect20002.881929.906667108295Personal Loan, Auto Loan, Mortgage Loan, Student Loan, and Student Loan332518.319.0Bad3571.732.3912886 Years and 4 MonthsYes60.964772107.21074164760236Low_spent_Small_value_payments314.8151526456419
499930x25fe3CUS_0x8600OctoberSarah McBridec29031-35-0942Architect20002.881929.906667108295Personal Loan, Auto Loan, Mortgage Loan, Student Loan, and Student Loan332518.3112.0Bad3571.737.5285116 Years and 5 MonthsYes60.96477271.79442082882734Low_spent_Small_value_payments350.23147346441687
499940x25fe4CUS_0x8600NovemberSarah McBridec29031-35-0942_______20002.881929.906667108295Personal Loan, Auto Loan, Mortgage Loan, Student Loan, and Student Loan332218.3112.0Bad3571.727.0278126 Years and 6 MonthsYes60.96477250.84684680498023High_spent_Small_value_payments341.179047488264
499950x25fe5CUS_0x8600DecemberSarah McBridec4975031-35-0942Architect20002.881929.906667108295Personal Loan, Auto Loan, Mortgage Loan, Student Loan, and Student Loan332518.3112.0_3571.734.780553NaNYes60.964772146.48632477751087Low_spent_Small_value_payments275.53956951573343
499960x25feeCUS_0x942cSeptemberNicks25078-73-5990Mechanic39628.99NaN4672_Auto Loan, and Student Loan20NaN11.57.0Good502.3827.75852231 Years and 11 MonthsNM35.104023181.44299902757518Low_spent_Small_value_payments409.39456169535066
499970x25fefCUS_0x942cOctoberNicks25078-73-5990Mechanic39628.993359.4158334672Auto Loan, and Student Loan23513.57.0Good502.3836.85854232 Years and 0 MonthsNo35.104023__10000__Low_spent_Large_value_payments349.7263321025098
499980x25ff0CUS_0x942cNovemberNicks25078-73-5990Mechanic39628.99NaN4672_Auto Loan, and Student Loan216_11.57.0Good502.3839.13984032 Years and 1 MonthsNo35.10402397.59857973344877High_spent_Small_value_payments463.23898098947717
499990x25ff1CUS_0x942cDecemberNicks25078-73-5990Mechanic39628.993359.4158334672Auto Loan, and Student Loan22511.57.0_502.3834.10853032 Years and 2 MonthsNo35.104023220.45787812168732Low_spent_Medium_value_payments360.37968260123847